Auditory Scene Analysis: Computational Models

Author

  • Guy J. Brown
Abstract

Human listeners have a remarkable ability to separate a complex mixture of sounds into discrete sources. The processes underlying this ability have been termed ‘auditory scene analysis’ (Bregman 1990; this volume). Recently, an interdisciplinary field known as ‘computational auditory scene analysis’ (CASA) has emerged which aims to develop computer systems that mimic this aspect of hearing (Rosenthal and Okuno 1998). Work in CASA is motivated both by a desire to understand the mechanisms of auditory perceptual organisation, and by a demand for practical sound separation devices. Currently, automatic speech recognizers perform badly in noisy acoustic environments: it is likely that their performance could be improved by integrating CASA with speech recognition technology. Other applications of CASA include hearing prostheses and music analysis. This entry considers three general classes of CASA system, and discusses their relative merits. Evaluation techniques for CASA are described, and outstanding challenges in the field are identified.

Similar resources

Auditory Scene Analysis: Computational Models

Listeners have to make sense of a complex acoustic world containing overlapping sound sources that must be organized into individual auditory objects. Computational auditory scene analysis concerns the use of algorithms inspired by human sound perception whose aim is to extract properties of constituent sound sources in a complex mixture. Starting with representations based on models of how soun...

The auditory organization of speech and other sources in listeners and computational models

Speech is typically perceived against a background of other sounds. Listeners are adept at extracting target sources from the acoustic mixture reaching the ears. The auditory scene analysis account holds that this feat is the result of a two stage process. In the first stage, sound is decomposed both within and across auditory nuclei. Subsequent processes of perceptual organisation are informed...

The auditory organization of speech and other sources in listeners and computational models

Speech is typically perceived against a background of other sounds. Listeners are adept at extracting target sources from the acoustic mixture reaching the ears. The auditory scene analysis account holds that this feat is the result of a two stage process: In the first stage sound is decomposed into collections of fragments in several dimensions. Subsequent processes of perceptual organization ...

Connectionist Models for Auditory Scene Analysis

Although the visual and auditory systems share the same basic tasks of informing an organism about its environment, most connectionist work on hearing to date has been devoted to the very different problem of speech recognition. We believe that the most fundamental task of the auditory system is the analysis of acoustic signals into components corresponding to individual sound sources, which ...

Bregman's Chimerae: Music Perception as Auditory Scene Analysis

Research into the perception and cognition of music listening often contains implicit assumptions about the nature of the underlying mental representations, and about the relationship between "auditory processing" and "music perception". We attempt to highlight and problematize some of these assumptions and to provide a more cognitively appropriate model for music perception and cognition, base...

Underconstrained Stochastic Representations for Top-down Computational Auditory Scene Analysis

Since Bregman published his unifying account of psychological results in auditory organization, Auditory Scene Analysis [1], there has been a series of computational models of these principles. The dominant approach, as embodied in the dissertations of Cooke [2], Mellinger [3] and Brown [4], and elsewhere [5], may be characterized as follows: First the sound is processed by a conventional signalpr...

Publication date: 1999